1 Project Summary

PI: Example PI name

Institution: Example Institution Name

Department: Example Department

Study Contact: Example Study Contact

Project: Example Project

Study Title: Example Study Title

Hypothesis/Goal: Example Hypothesis

Study Summary: Example Study Summary

1.1 Sample Information

The PI provided 6 Example Sample Type samples.

1.2 Download sample data, raw data, processed data, and statistical test results

The results of statistical tests to identify changed metabolites are provided. In addition, we include raw, feature-filtered, and normalized metabolomic intensity datasets. Please review the “Read Me” sheet included in the download for a detailed explanation of the variables.

1.3 Global metabolomics profiling

Global metabolomics profiling was performed on a Thermo Q-Exactive Orbitrap mass spectrometer with a Dionex UHPLC and autosampler. All samples were normalized by total protein content prior to extraction. Samples were analyzed in positive and negative heated electrospray ionization modes as separate injections, with a mass resolution of 35,000 at m/z 200. Separation was achieved on an ACE 18-pfp 100 x 2.1 mm, 2 µm column with 0.1% formic acid in water as mobile phase A and acetonitrile as mobile phase B. This is a polar embedded stationary phase that provides comprehensive coverage but has some limitations in the coverage of very polar species. The flow rate was 350 µL/min with a column temperature of 25°C. The injection volume was 4 µL for negative ions and 2 µL for positive ions.

1.4 Raw data generation

Metabolites were detected in both positive and negative ion modes as some metabolites are better ionized in one mode or the other.

MZmine (freeware) was used to identify, deisotope, and align features. All adducts and complexes were identified and removed from the data set. The mass and retention time data were searched against our internal metabolite library, and known metabolites were mapped to KEGG IDs.

2 Quality control and data transformation

Blank feature filtering was performed using interquartile range filtering as implemented in the R package MetaboAnalystR (https://github.com/xia-lab/MetaboAnalystR).
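To illustrate the general idea of interquartile range filtering, the sketch below ranks features by their IQR across samples and drops the least variable ones. It is a minimal illustration in Python, not the MetaboAnalystR code; the feature names, intensities, and the 50% drop fraction are all hypothetical, and the report does not state which cutoff was used.

```python
import statistics

def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def filter_by_iqr(features, drop_fraction):
    """Drop the given fraction of features with the smallest IQR."""
    ranked = sorted(features, key=lambda name: iqr(features[name]))
    n_drop = int(len(ranked) * drop_fraction)
    dropped = set(ranked[:n_drop])
    return {name: vals for name, vals in features.items() if name not in dropped}

# Hypothetical intensities: feature name -> one value per sample
features = {
    "feat_A": [100.0, 101.0, 99.0, 100.0],  # nearly constant
    "feat_B": [50.0, 400.0, 75.0, 390.0],   # highly variable
    "feat_C": [10.0, 30.0, 12.0, 28.0],     # moderately variable
    "feat_D": [5.0, 5.0, 5.0, 5.0],         # constant
}
kept = filter_by_iqr(features, drop_fraction=0.5)
print(sorted(kept))
```

Features with near-constant intensity carry little information for group comparisons, which is why they are the ones removed.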

Missing data were imputed by k-nearest neighbor imputation as implemented in the R package MetaboAnalystR (https://github.com/xia-lab/MetaboAnalystR).
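As a rough illustration of k-nearest neighbor imputation, the sketch below fills each missing value with the mean of that column in the k most similar rows. It is a simplified Python sketch under stated assumptions, not the routine used by MetaboAnalystR: the distance measure, the value of k, and the example matrix are hypothetical.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries using the mean of the k nearest rows that have the value."""
    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row with an observed value in column j
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical matrix with one missing value (None)
data = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [8.0, 9.0, 10.0],
]
imputed = knn_impute(data, k=2)
print(imputed[0])
```

The missing entry is filled from the two rows that resemble it most, so the dissimilar fourth row does not influence the result.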

Peak intensities were normalized sample-wise using sum normalization followed by log (base 10) transformation and Pareto scaling as implemented in the R package MetaboAnalystR (https://github.com/xia-lab/MetaboAnalystR). Please note that this method of normalization results in negative values, and these are expected in your normalized dataset.
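The following Python sketch walks through the three steps in order and shows why negative values appear: Pareto scaling mean-centers each feature, so any value below the feature mean becomes negative. It is an illustration under stated assumptions rather than the MetaboAnalystR code; the scaling constant of 1000 and the example intensities are hypothetical.

```python
import math
import statistics

def sum_normalize(sample, target=1000.0):
    """Scale one sample so its intensities add up to `target` (assumed constant)."""
    total = sum(sample)
    return [v / total * target for v in sample]

def pareto_scale(column):
    """Mean-center one feature and divide by the square root of its standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / math.sqrt(sd) for v in column]

def normalize(matrix):
    """Rows are samples, columns are features."""
    logged = [[math.log10(v) for v in sum_normalize(row)] for row in matrix]
    scaled_columns = [pareto_scale(list(col)) for col in zip(*logged)]
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical raw intensities: 3 samples x 3 features
raw = [
    [100.0, 200.0, 700.0],
    [150.0, 250.0, 600.0],
    [300.0, 300.0, 400.0],
]
normalized = normalize(raw)
for row in normalized:
    print([round(v, 3) for v in row])
```

After scaling, every feature is centered on zero, so roughly half of the values in each column fall below zero.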

Visualizations to assess the effect of normalization are provided below.

2.1 Normalization Result (Positive ionization mode)

2.2 Normalization Result (Negative ionization mode)

3 Statistical analysis

Multivariate and univariate statistical analyses were performed to assess clustering of samples based on metabolomic profiles and to identify metabolites with changes in abundance between groups or biological conditions.

Following analysis of all compounds detected in positive and negative ionization modes independently, the lists of compounds from the two ionization modes were combined and reduced to a single representative compound per likely metabolite based on p-value (the compound with the lowest p-value was retained).
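The reduction to one representative compound per metabolite can be sketched as follows. This is a minimal Python illustration, assuming each result is a record with a metabolite name, an ionization mode, and a p-value; the field names and example values are hypothetical.

```python
def merge_modes(positive, negative):
    """Combine both result lists, keeping the lowest p-value record per metabolite."""
    best = {}
    for record in positive + negative:
        name = record["metabolite"]
        if name not in best or record["p_value"] < best[name]["p_value"]:
            best[name] = record
    return list(best.values())

# Hypothetical per-mode results
positive = [
    {"metabolite": "Citrate", "mode": "pos", "p_value": 0.030},
    {"metabolite": "Alanine", "mode": "pos", "p_value": 0.200},
]
negative = [
    {"metabolite": "Citrate", "mode": "neg", "p_value": 0.004},
    {"metabolite": "Lactate", "mode": "neg", "p_value": 0.010},
]
combined = merge_modes(positive, negative)
for record in combined:
    print(record["metabolite"], record["mode"], record["p_value"])
```

In this example Citrate is detected in both modes, and only the negative-mode record is kept because its p-value is lower.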

3.1 PCA (multivariate analysis)

Principal component analysis of untargeted metabolomics data. Two-dimensional PCA score plots reveal possible separation in metabolite profiles related to variables of interest. Ellipses are calculated using the R package car (Fox J. and Weisberg S. 2019) and correspond to approximately 1 standard deviation.

3.2 Univariate analysis

For each compound, a t-test was performed using the R package stats (R Core Team 2023) to test the null hypothesis that the mean intensity for group one equals the mean intensity for group two. P-values were adjusted for multiple testing using the FDR method.
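To show what the p-value adjustment does, the sketch below implements the Benjamini-Hochberg procedure, assuming that is what "FDR method" refers to here (in R, p.adjust treats method = "fdr" as an alias for "BH"). The input p-values are hypothetical and the t-test itself is not reproduced.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Visit p-values from largest to smallest
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank when sorted ascending
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four compounds
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = bh_adjust(raw_p)
print([round(p, 4) for p in adj_p])
```

Each p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values in the same order as the raw ones.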

320 metabolites were significantly changed (p-value < 0.05) between KO and WT.

Table of significantly changed metabolites

Table of significantly changed metabolites for the overall effect of the independent variable and for pairwise contrasts, filtered as described above (p-value < 0.05).

To view results for a test of interest, click the arrow at the top of the contrast column to sort by contrast or use the search bar to search for a contrast. To see what metabolites were significantly changed for more than one test, sort by metabolite and see how many contrasts were significant for each metabolite. (Tip: Reverse sorting by Metabolite will display known metabolites first.)

Unknown compounds (those unidentified by our internal metabolite library) were assigned low-confidence metabolite names and KEGG IDs based on mass and the HMDB database as implemented in the R package metid (Shen X 2022). The annotation of these compounds is assigned a confidence value of “3” in downloadable tables and in the report. Compounds identified using our internal library are assigned a confidence value of “1”. Compounds that could not be identified via either method are annotated using their m/z_RT values and the confidence level is blank.
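The three annotation outcomes described above can be summarized in a short Python sketch. It only restates the rule in code form; the record fields and the exact formatting of the m/z_RT label are hypothetical.

```python
def annotate(feature):
    """Return (name, confidence) for one feature record."""
    if feature.get("library_name"):
        # Matched against the internal metabolite library
        return feature["library_name"], "1"
    if feature.get("hmdb_name"):
        # Low-confidence, mass-based match against HMDB
        return feature["hmdb_name"], "3"
    # No identification: label by m/z and retention time, confidence left blank
    return f"{feature['mz']}_{feature['rt']}", ""

# Hypothetical feature records
examples = [
    {"mz": 191.0197, "rt": 1.1, "library_name": "Citrate"},
    {"mz": 132.1019, "rt": 2.4, "hmdb_name": "Leucine"},
    {"mz": 180.0634, "rt": 1.25},
]
for feature in examples:
    print(annotate(feature))
```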

4 Visualizations of Changed Metabolites

All plots below can be zoomed, selected, and downloaded individually (as shown or as modified) using the toolbar at the top right of the figure, which appears when you hover over the figure. Hover over plot points to view the underlying data.

4.1 Barplots of changed metabolites by metabolite KEGG class

Metabolites with significant changes (p-value < 0.05) are shown by metabolite class.

Hover over bars to view significantly changed metabolites for each class. (If a contrast is missing, there were no significantly changed metabolites with KEGG IDs for that contrast.)

4.2 Volcano Plots

4.3 Per-contrast boxplot of any of the top changed metabolites across sample class

Per-contrast boxplot of any of the top significantly changed metabolites across sample class

For each contrast, the top changed metabolites are available to view as a boxplot of normalized peak intensity across sample class.

For some contrasts, dummy variables with no data (named beginning with X) have been added to the dropdown menu for ease of plotting. Please ignore.

Keep in mind that boxplots display the median of the data, not the mean. The test statistic and significance are based on a comparison of means.

4.4 Heatmaps of changed metabolites

Heatmaps of metabolites with most significant difference in abundance

For each contrast, the top changed metabolites are available to view as a heatmap of normalized peak intensity. Samples (columns) are clustered by the peak intensities for the displayed set of compounds.

Hover and select a subset of metabolites or samples of interest to export a zoomed-in subfigure.

4.5 KEGG over-representation and subnetwork analysis

KEGG analysis was performed with the R package FELLA (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2487-5). Starting from a list of metabolites of interest, FELLA applies a null diffusive process over a network-based representation of the KEGG database and derives a relevant subnetwork. The result of this analysis is a list of affected pathways and a graphical sub-pathway representation. The input list of metabolites of interest is indicated by the plot title.

The significantly changed compounds input into the KEGG subnetwork analysis are shown with red squares (see key) and are drawn in their KEGG subnetwork.
Hover over nodes to view annotation and to see the confidence of our identification of a particular input KEGG compound.
(If no subnetworks are visible below, there were no significantly changed metabolites with KEGG IDs.)

Select areas within this subnetwork to view and export clusters of interest.